🐿️ Scour
🧠 Inference Serving

Request Batching, Model Loading, Throughput Optimization, Latency Management

Optimal Scheduling Algorithms for LLM Inference: Theory and Practice
arxiv.org·12h
🧠LLM Inference
Learning About Threads: An Essential Guide for Developers
hackernoon.com·4h
⚙️Mechanical Sympathy
Mithril launches omnicloud for compute and batch inference
mithril.ai·21h
🖥GPUs
Announcements for AI Hypercomputer: The latest infrastructure news for ML practitioners
cloud.google.com·33m
🖥GPUs
Hessian analysis with JAX: a platform-agnostic, high-performance approach
lesswrong.com·11h
🕯️Candle
SAT Requires Exhaustive Search
link.springer.com·20h
🧮SMT Solvers
How to Improve Availability Using Deployment Patterns ★
newsletter.systemdesign.one·4h
💧Litestream
Uncovering memory corruption in NVIDIA Triton (as a new hire)
blog.trailofbits.com·5h
🌐Pingora
Deep learning model predicts microsatellite instability in tumors and flags uncertain cases
medicalxpress.com·1h
🛡️AI Safety
BOOST: Bayesian Optimization with Optimal Kernel and Acquisition Function Selection Technique
arxiv.org·12h
📊Statistical Ranking
🚨 BREAKING: ElevenLabs just changed content creation forever.
threadreaderapp.com·1h
🎭Claude
From search to answer engines: How to optimize for the next era of discovery
searchengineland.com·2h
💳Content Monetization
We Built an MCP Server and These Are the Gotchas Nobody Talks About
cloudquery.io·1h
📋MCP
Context Guided Transformer Entropy Modeling for Video Compression
arxiv.org·12h
📊Embeddings
Cloud Repatriation: Why Enterprises Are Moving Workloads Off Hyperscalers
blog.min.io·18h
🌐Distributed systems
🎲 Enter the Matrix
blog.webb.page·19h
📄File Formats
Building, Fast and Slow
idiallo.com·8h
👨‍💻Software development practices
Attention was never enough: Tracing the rise of hybrid LLMs
ai21.com·4h
🧠LLM Inference
A New Concurrent ML in Guile Scheme
wingolog.org·7h
🧵Concurrency Models
Show HN: Software devs, I made a tool to make creating estimates less painful
devtimate.com·7h
🔧Developer tools